Map Reduce: A Survey Paper on Recent Expansion
Abstract
With the rapid growth of data in recent years, industry and academia require intelligent data analysis tools capable of analyzing huge amounts of data. The MapReduce framework was designed to run data-intensive applications in support of effective decision making. Since its introduction, considerable research effort has gone into making it more accessible to users and into applying it to massive data-intensive applications. This survey presents the state of the art in improving the performance of various applications using recent MapReduce models and shows how these models help process large-scale datasets. The models are compared with Apache Hadoop and Phoenix, primarily in terms of execution time and fault tolerance. Finally, we give a high-level discussion of enhancements to MapReduce computation in specific problem areas such as iterative computation, continuous query processing, and hybrid databases.
Keywords: Map Reduce; Hadoop; Iterative Computation; Phoenix; Databases
Related articles
Developing a model for simulating urban expansion based on the concept of decision risk: A case study in Babol city
Today, the study of the spatial-temporal pattern of urban physical expansion and the identification of the parameters affecting the expansion play a crucial role in urban-related decision-making and long-term planning processes. Consequently, the use of precise and efficient methods to predict the physical expansion of urban areas is of great importance. The objective of the present study is to pro...
A Survey on Accelerated Mapreduce for Hadoop
Big Data is defined by the 3Vs, which stand for variety, volume and velocity. The volume of data is huge, data exists in a variety of file types, and data grows very rapidly. Big data storage and processing has always been a big issue. Big data has become even more challenging to handle these days. To handle big data, high-performance techniques have been introduced. Several frameworks like Apache...
Distributed Parameter Map-Reduce
This paper describes how to convert a machine learning problem into a series of map-reduce tasks. We study the logistic regression algorithm. In logistic regression, it is assumed that samples are independent and each sample is assigned a probability. Parameters are obtained by maximizing the product of all sample probabilities. Rapid expansion of training samples brings challenges to mach...
Survey on Load Balancing and Data Skew Mitigation in Mapreduce Applications
For the past few years, the Map Reduce programming model has shown great success in processing huge amounts of data. Map Reduce is a framework for data-intensive distributed computing of batch jobs. This data-intensive processing creates skew in the Map Reduce framework and significantly degrades performance. This leads to greatly varying execution times for Map Reduce jobs. Due to this varying execution t...
Map-Reduce Expansion of the ISGA Genomic Analysis Web Server
Biological sequence data can be subjected to a variety of analysis workflows to glean pertinent scientific insight. Recent advances in sequencing techniques have led to a deluge of biosequence data, which necessitates the use of high-performance computing resources in order to carry out analysis in a reasonable period of time. The tasks involved in creating and managing these computational jobs...